A Note on Berger, Sellke / the Irreconcilability of P-values and Evidence
Abstract
We place the paper [Berger-Sellke] by Berger and Sellke in the context of the discussion about the validity of p-values and alternative methods for quantifying evidence against a point null hypothesis.

1. Prologue: Quantifying Evidence against a Point Null Hypothesis

Fisher's p-values are everywhere in empirical science. Quoting a recent provocative paper [1], which addressed the (ir)reproducibility crisis in medical science: "Research is not most appropriately represented and summarized by p-values, but, unfortunately, there is a widespread notion that medical research articles should be interpreted based only on p-values."

Somewhat more belligerent is [2]: "And we, as teachers, consultants, authors, and otherwise perpetrators of quantitative methods, are responsible for the ritualization of null hypothesis significance testing [...] to the point of meaninglessness and beyond."

This is a short review of the part philosophical, part statistical, part scientific discussion within the statistical community about the Big Question:

What is the correct way to quantify and weigh empirical evidence against a point null hypothesis?

For the purpose of this short text, it is easier to focus on the simplest and most common case, so we will instead consider the Small Question:

Given a sample x = (x₁, ..., xₙ) of (X₁, ..., Xₙ) ∼ N(θ, σ²) iid (σ² known), what is the correct way to quantify and weigh empirical evidence against the hypothesis of no effect, H0: θ = 0?

Date: June 29, 2010.
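To make the Small Question concrete, the following is a minimal sketch (not taken from the paper) of the classical answer: the z statistic √n·x̄/σ and its two-sided p-value under H0: θ = 0. As a contrast, it also evaluates the lower bound −e·p·ln(p) on the Bayes factor in favour of H0, which is the calibration associated with Sellke, Bayarri and Berger and is valid only for p < 1/e. The function names and the equal prior odds used in the usage lines are assumptions made for illustration.

```python
import math


def z_statistic(sample, sigma):
    """z = sqrt(n) * mean / sigma for testing H0: theta = 0 with known sigma."""
    n = len(sample)
    mean = sum(sample) / n
    return math.sqrt(n) * mean / sigma


def two_sided_p_value(z):
    """Two-sided p-value P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def bayes_factor_lower_bound(p):
    """Lower bound -e * p * ln(p) on the Bayes factor for H0 (requires p < 1/e)."""
    if not 0.0 < p < 1.0 / math.e:
        raise ValueError("the bound is only valid for 0 < p < 1/e")
    return -math.e * p * math.log(p)


def posterior_lower_bound(p, prior_h0=0.5):
    """Lower bound on P(H0 | data) implied by the Bayes factor bound.

    prior_h0 is an assumed prior probability of H0 (illustrative default 1/2).
    """
    prior_odds = prior_h0 / (1.0 - prior_h0)
    posterior_odds = prior_odds * bayes_factor_lower_bound(p)
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    z = 1.96  # a result that is "just significant" at the 5% level
    p = two_sided_p_value(z)
    print(f"z = {z:.2f}, two-sided p = {p:.4f}")
    print(f"Bayes factor lower bound = {bayes_factor_lower_bound(p):.3f}")
    print(f"posterior P(H0 | x) lower bound = {posterior_lower_bound(p):.3f}")
```

Under these assumptions, a p-value near 0.05 corresponds to a lower bound on the posterior probability of H0 that is several times larger than the p-value itself, which is the kind of gap the discussion above is concerned with.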
Similar resources
Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence
The problem of testing a point null hypothesis (or a "small interval" null hypothesis) is considered. Of interest is the relationship between the P value (or observed significance level) and conditional and Bayesian measures of evidence against the null hypothesis. Although one might presume that a small P value indicates the presence of strong evidence against the null, such is not necessarily the...
Calibration of p Values for Testing Precise Null Hypotheses
Thomas Sellke, M. J. Bayarri, and James O. Berger. Thomas Sellke is Professor, Statistics Department, Purdue University, West Lafayette, IN 47907-1339. M. J. Bayarri is Professor, Department of Statistics and Operations Research, University of Valencia, Burjassot, Valencia 46100, Spain. James O. Berger is Arts and Sciences Professor, Insti...
Calibration of p Values for Testing Precise Null Hypotheses
P values are the most commonly used tool to measure evidence against a hypothesis or hypothesized model. Unfortunately, they are often incorrectly viewed as an error probability for rejection of the hypothesis or, even worse, as the posterior probability that the hypothesis is true. The fact that these interpretations can be completely misleading when testing precise hypotheses is first reviewe...
Reconciling Bayesian and Frequentist Evidence in the One-Sided Testing Problem
For the one-sided hypothesis testing problem it is shown that it is possible to reconcile Bayesian evidence against H0, expressed in terms of the posterior probability that H0 is true, with frequentist evidence against H0, expressed in terms of the p value. In fact, for many classes of prior distributions it is shown that the infimum of the Bayesian posterior probability of H0 is equal to the p...
Reconciling Bayesian and Frequentist
For the one-sided hypothesis testing problem it is shown that it is possible to reconcile Bayesian evidence against H0, expressed in terms of the posterior probability that H0 is true, with frequentist evidence against H0, expressed in terms of the p-value. In fact, for many classes of prior distributions it is shown that the infimum of the Bayesian posterior probability of H0 is either equal t...